DEPARTMENT OF ECONOMICS Probability Matching and Reinforcement Learning*
Abstract
Probability matching occurs when an action is chosen with a frequency equal to the probability that the action is the best choice. This sub-optimal behavior has been reported repeatedly by psychologists and experimental economists. We provide an evolutionary foundation for this phenomenon by showing that learning by reinforcement can lead to probability matching and that, if learning occurs sufficiently slowly, probability matching occurs not only in choice frequencies but also in choice probabilities. We complete our results by proving that there exists no quasi-linear reinforcement learning specification such that behavior is optimal for all environments where counterfactuals are observed. JEL Classification Number: C73.
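To see why probability matching is sub-optimal, consider a stylized two-action setting that is an illustrative assumption, not the paper's model: action A is the best choice with probability p, independently in each round. A decision maker who matches chooses A with probability p, while an optimizing one always chooses the more likely action. The function names below are hypothetical, and the sketch only works through this arithmetic.

```python
def matching_accuracy(p: float) -> float:
    """Expected success rate under probability matching.

    Assumption: action A is best with probability p, and the decision
    maker independently chooses A with that same probability p.
    """
    return p * p + (1.0 - p) * (1.0 - p)


def optimal_accuracy(p: float) -> float:
    """Expected success rate when the more likely action is always chosen."""
    return max(p, 1.0 - p)


if __name__ == "__main__":
    # Compare the two rules for a few values of p.
    for p in (0.5, 0.6, 0.7, 0.9):
        print(
            f"p={p:.1f}  matching={matching_accuracy(p):.2f}  "
            f"optimal={optimal_accuracy(p):.2f}"
        )
```

Under these assumptions, matching at p = 0.7 succeeds 58% of the time, whereas always choosing A succeeds 70% of the time. The two rules coincide only when p is 0.5, 0, or 1.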
Similar resources
Learning in Economics Experiments
Keywords: reinforcement learning, belief learning, experiments, probability matching, market price-choice games, computer simulations. This paper shows how simple psychological models of reinforcement and belief learning can explain dynamic patterns of adjustment in economics experiments.
Reinforcement Learning by Probability Matching Philip
We present a new algorithm for associative reinforcement learning. The algorithm is based upon the idea of matching a network's output probability with a probability distribution derived from the environment's reward signal. This Probability Matching algorithm is shown to perform faster and be less susceptible to local minima than previously existing algorithms. We use Probability Matching to t...
Information Directed Reinforcement Learning
Efficient exploration is recognized as a key difficulty in reinforcement learning. We consider an episodic undiscounted MDP where the goal is to minimize the sum of regrets over different episodes. Classical methods are either based on optimism in the face of uncertainty or on probability matching. In this project we explore an approach that aims at quantifying the cost of exploration while rem...
Map Matching with Inverse Reinforcement Learning
We study map matching, the problem of estimating the route traveled by a vehicle from points observed with the Global Positioning System. A state-of-the-art approach to this problem is a Hidden Markov Model (HMM). We propose a particular transition probability between latent road segments that uses the number of turns in addition to the travel distance between t...